Sampling and Integration

${\bf \large Hovhannes \ Grigoryan}\\ {\rm \small New \ York, \ NY}$

Abstract

The calculation of some observables can be done either through sampling or through integration. In many instances Monte Carlo sampling is superior to integration, especially in higher dimensions. In this part we start with a simple example: sampling points from the Gaussian distribution by transforming the variables inside the Gaussian integral so that the new variables are uniformly distributed. We then show how to sample points uniformly on a sphere and inside a ball, using either direct sampling or a Markov-chain algorithm; the latter implements a random walk on a sphere or on a circle. We discuss how the Metropolis acceptance probability in the Markov-chain method can be used to sample points from an arbitrary distribution, as well as competing direct-sampling approaches. We also discuss the rejection-free tower sampling and Walker algorithms. We consider the example of the inverse-square-root distribution, discuss the problems associated with it, and show how to solve them. We conclude by calculating the volume and surface area of the d-dimensional hypersphere using Monte Carlo methods [1].


Content

  • Gaussian Distribution from Uniform Distribution

  • Direct way to sample points on a unit sphere

  • Sampling uniform distribution inside the unit ball

  • Metropolis Acceptance Probability

  • Direct Sampling Rejection Cut Algorithm

  • Tower Sampling Method (rejection free)

  • Walker Algorithm

  • Calculation of High Dimensional Integrals Using Monte Carlo Methods

References

[1] We closely follow the outline of the lectures by W. Krauth, "Statistical Mechanics: Algorithms and Computations", Oxford University Press, 2006.

In [1]:
%%html
<style>
body, p, div.rendered_html { 
    color: black;
    font-family: "Times New Roman", serif;
    font-size: 12pt;
}
</style>
In [2]:
import random
import numpy as np

%precision 20

%matplotlib inline
%config InlineBackend.figure_format = 'retina'

import matplotlib.pyplot as plt
from matplotlib import animation, rc, cm
from mpl_toolkits.mplot3d import axes3d, Axes3D

from IPython.display import HTML, Image, clear_output

plt.rcParams['figure.figsize'] = (14,8) 
plt.style.use('fivethirtyeight')
rc('animation', html='html5')

import warnings
warnings.filterwarnings('ignore')

import sys, os
sys.path.append(os.path.abspath(os.path.join('..')))

from utils.make_gif import make_gif, clear_data
from utils.html_converter import html_converter

Consider the example of estimating $\pi$ from P1. We can do this either by sampling or by integration:

  • Monte Carlo: $\frac{\pi}{4} \approx \frac{N_{hits}}{N_{trials}} = \frac{1}{N_{trials}} \sum_i O_i $
  • Integration: $\frac{\pi}{4} = \frac{\int^1_{-1}\int^1_{-1} dxdy~ O(x,y)\pi(x,y)}{\int^1_{-1}\int^1_{-1} dxdy~\pi(x,y)}$
  • $O(x,y) = \left\{\begin{array}{ll} 1 & \mbox{if $x^2+y^2 \leq 1$},\\ 0 & \mbox{if $x^2+y^2 > 1$}.\end{array}\right.$

where $\pi(x,y)=1$ is the uniform probability density inside the square. It is clear that Monte Carlo sampling methods should excel in higher dimensions.
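The sampling estimate above can be sketched in a few lines (a minimal illustration; `estimate_pi` is our own helper name):

```python
import random

def estimate_pi(n_trials, seed=0):
    """Estimate pi as 4 * N_hits / N_trials by throwing uniform
    points into the square [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)
    n_hits = 0
    for _ in range(n_trials):
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:   # O(x, y) = 1 inside the unit disk
            n_hits += 1
    return 4.0 * n_hits / n_trials

print(estimate_pi(100000))   # roughly 3.14
```

The statistical error decreases as $1/\sqrt{N_{trials}}$ regardless of dimension, which is why sampling wins over grid integration in high dimensions.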

Gaussian Distribution from Uniform Distribution

Consider the 1D Gaussian integral $I = \int^\infty_{-\infty}\frac{\text{d}x}{\sqrt{2\pi}}e^{-x^2/2}$, which is known to equal 1. We evaluate its square as follows:

$I^2=\int^\infty_{-\infty}\frac{\text{d}y}{\sqrt{2\pi}} \int^\infty_{-\infty}\frac{\text{d}x}{\sqrt{2\pi}}e^{-(x^2+y^2)/2}=\int^{2\pi}_0 \frac {\text{d}\phi}{2\pi}\int^\infty_0 r e^{-r^2/2} \text{d}r =\int^{2\pi}_0 \frac {\text{d}\phi}{2\pi}\int^\infty_0 e^{-\psi} \text{d}\psi = \int^{2\pi}_0 \frac {\text{d}\phi}{2\pi}\int^1_0 \text{d}\Upsilon=1$,

where $x=r \cos\phi$, $y = r \sin \phi$, $\psi = r^2/2$, $\Upsilon = e^{-\psi}$

These variable transformations define a mapping between the uniformly distributed random variables $\Upsilon, \phi$ and the Gaussian-distributed random variables $x, y$: a change of variables in an integral applies equally to the samples. This is precisely the Box-Muller method for generating Gaussian samples from uniform ones.

In [3]:
def gauss_test(sigma):
    phi = random.uniform(0.0, 2.0 * np.pi)
    Upsilon = random.uniform(0.0, 1.0)
    Psi = - np.log(Upsilon)
    r = sigma * np.sqrt(2.0 * Psi)
    x = r * np.cos(phi)
    y = r * np.sin(phi)
    return [x, y]
In [4]:
# exact distribution:
list_x = [i * 0.1 for i in range(-40, 40)]
list_y = [np.exp(- x ** 2 / 2.0) / (np.sqrt(2.0 * np.pi)) for x in list_x]

# sampled distribution:
n_sampled_pairs = 50000
data = []

for sample in range(n_sampled_pairs):
    data += gauss_test(1.0)

# graphics output
fig = plt.figure(figsize=(12,8))
plt.plot(list_x, list_y, color='b', label='exact', lw=0.5)
plt.hist(data, bins=150, density=True, color='r', histtype='step', label='sampled');
plt.legend()
plt.title('Sampling of the gaussian distribution')
plt.xlabel('$x$', fontsize=14)
plt.ylabel('$\pi(x)$', fontsize=14)
plt.show()
In [5]:
# sampled distribution:
n_sampled_pairs = 50000
data = []

for sample in range(n_sampled_pairs):
    data.append(gauss_test(1.0))

X = np.random.randn(n_sampled_pairs)
Y = np.random.randn(n_sampled_pairs)

fig = plt.figure(figsize=(8,8))
plt.scatter(X, Y, color='b', label='exact', lw=0.3, alpha=0.7, marker='x') # by 'exact' we mean generated by a better algorithm
plt.scatter(np.array(data)[:,0], np.array(data)[:,1], color='r', label='sampled', lw=0.3, alpha=0.7, marker='x')
plt.legend()
plt.title('Sampling of the gaussian distribution')
plt.axis('scaled')
plt.xlabel('$x$', fontsize=14)
plt.ylabel('$y$', fontsize=14)
plt.show()

Direct way to sample points on a unit sphere

As was shown above, change of variables in the integral affects the distribution of samples. Therefore,

$1 = I^d= \left(\frac{1}{\sqrt{2\pi}}\right)^d \int^{\infty}_{-\infty} \dots \int^{\infty}_{-\infty}\text{d}x_0 \dots \text{d}x_{d-1} e^{-(x_0^2+ \dots + x^2_{d-1})/2} = \left(\frac{1}{\sqrt{2\pi}}\right)^d \int^\infty_{0} \text{d}r~ r^{d-1} e^{-r^2/2} \int \text{d}\Omega $

where we went from Cartesian coordinates $(x_0, \dots, x_{d-1})$ to polar coordinates $(r, \Omega)$. Since the integrand depends only on $r$, the angular part $\Omega$ is uniformly distributed over the sphere, so we can sample the surface of the unit sphere uniformly by drawing $d$ independent Gaussians and normalizing the resulting vector.

In [6]:
%%time
x_list, y_list, z_list = [], [], []
nsamples = 6000

for sample in range(nsamples):
    x, y, z = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    radius = np.sqrt(x ** 2 + y ** 2 + z ** 2)
    x_list.append(x / radius)
    y_list.append(y / radius)
    z_list.append(z / radius)

# graphics output
fig = plt.figure(figsize=(10,10))
ax = fig.add_subplot(projection='3d')  # fig.gca(projection=...) was removed in Matplotlib 3.6
ax.set_aspect('equal')
plt.plot(x_list, y_list, z_list, '.')
plt.title('Uniform sampling of the sphere surface')
ax.set_xlabel('$x$', fontsize=14)
ax.set_ylabel('$y$', fontsize=14)
ax.set_zlabel('$z$', fontsize=14)
plt.show()
Wall time: 318 ms

A vectorized implementation of the same algorithm.

In [7]:
%%time
nsamples = 6000

xyz = np.random.randn(nsamples, 3)
lens = np.linalg.norm(xyz, axis=1) 
xyz_normed = xyz/lens.reshape(-1,1)
x_list, y_list, z_list = xyz_normed[:, 0], xyz_normed[:, 1], xyz_normed[:, 2]
    
# graphics output
fig = plt.figure(figsize=(10,10))
ax = fig.add_subplot(projection='3d')  # fig.gca(projection=...) was removed in Matplotlib 3.6
ax.set_aspect('equal')
plt.plot(x_list, y_list, z_list, '.')
plt.title('Uniform sampling of the sphere surface')
ax.set_xlabel('$x$', fontsize=14)
ax.set_ylabel('$y$', fontsize=14)
ax.set_zlabel('$z$', fontsize=14)
plt.show()
Wall time: 298 ms

Markov-chain algorithm on the surface of a d-dimensional unit sphere

In [8]:
d = 3  # dimension        
delta = 0.2  # step size
sigma = 1.0/np.sqrt(d)

n_samples = 1000
data = []

x = np.random.randn(d)
x /= np.linalg.norm(x)

for _ in range(n_samples):
    eps = sigma * np.random.randn(d)
    eps -= np.dot(eps, x) * x     # remove the component of eps along x
    eps /= np.linalg.norm(eps)    # unit tangent vector at x
    Upsilon = random.uniform(-delta, delta)
    x = x + Upsilon * eps         # step along the tangent (new array, stored samples untouched)
    x /= np.linalg.norm(x)        # project back onto the sphere
    data.append(x)
In [9]:
adata = np.array(data)
x_list, y_list, z_list = adata[:, 0], adata[:, 1], adata[:, 2]
    
# graphics output
fig = plt.figure(figsize=(10,10))
ax = fig.add_subplot(projection='3d')  # fig.gca(projection=...) was removed in Matplotlib 3.6
ax.set_aspect('equal')
plt.plot(x_list, y_list, z_list, '.')
plt.title('Uniform sampling of the sphere surface')
ax.set_xlabel('$x$', fontsize=14)
ax.set_ylabel('$y$', fontsize=14)
ax.set_zlabel('$z$', fontsize=14)
plt.show()

Random walk on the surface of a sphere

In [10]:
def plot3D_markov_sphere(old_data, new_point, step, save=False):
    fig = plt.figure(figsize=(10, 10))
    grid = plt.GridSpec(8, 4, hspace=0.3, wspace=0.1)

    # set up the axes for the first plot
    ax = fig.add_subplot(grid[:-1, 1:], projection='3d')

    # draw sphere
    u, v = np.mgrid[0:2*np.pi:200j, 0:np.pi:100j]
    x = np.cos(u)*np.sin(v)
    y = np.sin(u)*np.sin(v)
    z = np.cos(v)
    ax.plot_surface(x, y, z, rstride=1, cstride=1, cmap=cm.coolwarm,
                           linewidth=0, antialiased=False, alpha=0.2)

    # keep past points
    X, Y, Z = old_data
    xa, ya, za = np.array(X), np.array(Y), np.array(Z)
    ax.plot(xa, ya, za, 'k.', alpha=0.4)       

    # draw new point
    x_n, y_n, z_n = new_point
    ax.plot([x_n], [y_n], [z_n], 'ro')
    plt.title('Random walk on the sphere t='+step, fontsize=18)
    ax.set_xlabel('$x$', fontsize=18)
    ax.set_ylabel('$y$', fontsize=18)
    ax.set_zlabel('$z$', fontsize=18)
    ax.set_xlim(-1, 1)
    ax.set_ylim(-1, 1)
    ax.set_zlim(-1, 1)
    ax.set_aspect('equal')

    theta = np.arccos(za/np.sqrt(xa**2 + ya**2 + za**2))
    phi = np.arctan(ya/xa)
    tt = np.linspace(0, np.pi, 100)
    dist_theta = 0.5*np.sin(tt)
    
    ax2 = fig.add_subplot(grid[:-1, 0])
    ax2.hist(theta, bins=20, density=True, histtype='stepfilled', orientation='horizontal')
    ax2.plot(dist_theta, tt, 'k--', label="$0.5 \sin(\\theta)$")
    ax2.set_ylim(0, np.pi)
    ax2.legend(loc="upper right")
    ax2.invert_yaxis()
    plt.title('Polar angle ($\\theta$)', fontsize=18)

    ax3 = fig.add_subplot(grid[-1, 1:])
    ax3.hist(phi, bins=20, density=True, histtype='stepfilled', orientation='vertical')
    ax3.set_xlim(-np.pi/2, np.pi/2)
    ax3.invert_xaxis()
    plt.title('Azimuthal angle ($\phi$)', fontsize=18)
    if save:
        plt.savefig('../data/random_walk_sphere_t='+step+'.png', transparent=True)
    plt.show()


def plot2D_markov_sphere(old_data, new_point, step, save=False):
    plt.rcParams['figure.figsize'] = (18, 10) 

    # set up a figure twice as wide as it is tall
    fig = plt.figure(figsize=plt.figaspect(0.5))

    # set up the axes for the first plot
    ax = fig.add_subplot(1, 2, 1)

    # keep past points
    X, Y = old_data
    xa, ya = np.array(X), np.array(Y)
    ax.plot(xa, ya, 'k.', alpha=0.4)       

    # draw new point
    x_n, y_n = new_point
    ax.plot([x_n], [y_n], 'ro')
    plt.title('Random walk on the sphere t='+step, fontsize=18)
    ax.set_xlabel('$x$', fontsize=18)
    ax.set_ylabel('$y$', fontsize=18)
    ax.set_xlim(-1.2, 1.2)
    ax.set_ylim(-1.2, 1.2)

    ax2 = fig.add_subplot(1, 2, 2)
    phi = np.arctan(ya/xa)
    ax2.hist(phi, bins=20, density=True)
    plt.title('Azimuthal angle ($\phi$) at t='+step, fontsize=18)

    if save:
        plt.savefig('../data/random_walk_sphere_t='+step+'.png', transparent=True)
    plt.show()
In [11]:
def markov_sphere_move(x, delta=0.2, d=3):
    sigma = 1.0/np.sqrt(d)
    eps = sigma * np.random.randn(d)
    P = np.dot(eps, x)
    eps -= P * x
    eps /= np.linalg.norm(eps)
    u = random.uniform(-delta, delta)
    x += u * eps
    x /= np.linalg.norm(x)
    return x
In [ ]:
tmax = 100
maxlength = len(str(tmax - 1))

delta = 0.5
d = 3

x = np.random.randn(d)
x_init = x/np.linalg.norm(x)

X, Y, Z = [], [], []

for t in range(tmax):
    number_string = str(t).zfill(maxlength)
    x_new = markov_sphere_move(x_init, delta, d)
    x_init = x_new
    
    X.append(x_new[0])
    Y.append(x_new[1])
    Z.append(x_new[2])
    
    if t % 10 == 0:
        clear_output(wait=True)
        plot3D_markov_sphere((X, Y, Z), x_new, number_string, save=True)
In [ ]:
make_gif('random_walk_sphere_t', 0.0)
In [13]:
%%html

<div style="display: flex; justify-content: row;">
    <img src='data/random_walk_sphere_t.gif'>
</div>

Random walk on a circle

In [17]:
tmax = 500
maxlength = len(str(tmax - 1))

delta = 1/np.sqrt(7.0)
d = 2

x = np.random.randn(d)
x_init = x/np.linalg.norm(x)

X, Y = [], []

for t in range(tmax):
    number_string = str(t).zfill(maxlength)
    x_new = markov_sphere_move(x_init, delta, d)
    x_init = x_new
    
    X.append(x_new[0])
    Y.append(x_new[1])
    
    clear_output(wait=True)
    plot2D_markov_sphere((X, Y), x_new, number_string)

Sampling uniform distribution inside the unit ball

To sample uniformly inside the unit ball, the radial coordinate must be drawn from the distribution:

$$ \pi(r) = d r^{d-1}, \ \ \ 0 < r < 1 $$

We can sample these points as Uniform$([0,1])^{1/d}$, using the universality of the uniform (inverse-CDF) theorem, which states the following:

  • For the r.v. $Y = F_X(X)$, we have $F_Y(u) = F_U(u)$, i.e. $Y \sim \text{Uniform}([0,1])$.
  • For the r.v. $Z = F_X^{-1}(U)$ with $U \sim \text{Uniform}([0,1])$, we have $F_Z(u) = F_X(u)$, i.e. $Z$ has the distribution of $X$.

Indeed, the CDF corresponding to $\pi(r)$ is:

$$ F_R(r) = r^{d} \ \implies \ R = F_R^{-1}(U) = U^{1/d} \ \text{has density} \ \pi(r) $$
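As a quick sanity check of this inverse-CDF recipe (a sketch; `sample_radius` is our own helper, and we use the fact that $\pi(r) = d\,r^{d-1}$ has mean $d/(d+1)$):

```python
import random

def sample_radius(d, n, seed=0):
    """Sample n radii with density pi(r) = d * r**(d-1) on (0, 1)
    via the inverse CDF: r = F_R^{-1}(u) = u**(1/d)."""
    rng = random.Random(seed)
    return [rng.uniform(0.0, 1.0) ** (1.0 / d) for _ in range(n)]

d, n = 3, 100000
radii = sample_radius(d, n)
mean_r = sum(radii) / n
print(mean_r, d / (d + 1.0))   # empirical mean vs the exact E[r] = d/(d+1)
```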
In [18]:
x_list, y_list, z_list = [],[],[]
nsamples = 6000

for sample in range(nsamples):
    x, y, z = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    length = random.uniform(0.0, 1.0) ** (1.0 / 3.0) / np.sqrt(x ** 2 + y ** 2 + z ** 2)
    x, y, z = x * length, y * length, z * length
    x_list.append(x)
    y_list.append(y)
    z_list.append(z)

# graphics output
fig = plt.figure(figsize=(10,10))
ax = fig.add_subplot(projection='3d')  # fig.gca(projection=...) was removed in Matplotlib 3.6
ax.set_aspect('equal')
plt.title('Uniform sampling inside the sphere')
ax.set_xlabel('$x$', fontsize=14)
ax.set_ylabel('$y$', fontsize=14)
ax.set_zlabel('$z$', fontsize=14)
plt.plot(x_list, y_list, z_list, '.')
plt.show()

Metropolis Acceptance Probability

The essence of the Metropolis acceptance probability: propose a uniform random move, and accept the new sample with probability:

$$p(a\to b) = \min(1, \pi(b)/\pi(a)),$$

where $a$ is the initial state, $b$ is the new state, and $\pi(a)$ is the probability distribution.

Generating Samples from Given Distribution

Computationally, to generate a sample from the distribution $\pi(x)$:

  • Start from the current state $a$, and propose a uniform random move to a candidate state $b$

  • Generate a uniformly distributed random variable $\Upsilon \sim \text{Uniform}([0, 1])$

  • If $\Upsilon < \pi(b)/\pi(a)$, accept the move to the new state $b$; otherwise, reject it and keep $a$.

In [19]:
x = 0.0
delta = 0.5
data = []
N = 50000

for k in range(N):
    x_new = x + random.uniform(-delta, delta)
    if random.uniform(0.0, 1.0) < np.exp(- x_new ** 2 / 2.0) / np.exp(- x ** 2 / 2.0): 
        x = x_new 
    data.append(x)

t = np.linspace(-3.5, 3.5, N) 
y = np.exp(- t ** 2 / 2.0) / np.sqrt(2.0 * np.pi) 

fig = plt.figure(figsize=(12,8))
plt.hist(data, 100, density=True, label="generated samples")
plt.plot(t, y, c='red', linewidth=2.0, label="$e^{-x^2/2}/\sqrt{2\pi}$")
plt.title('Theoretical Gaussian distribution $\pi(x)$ and \
    \nnormalized histogram for '+str(len(data))+' samples', fontsize = 18)
plt.xlabel('$x$', fontsize = 18)
plt.ylabel('$\pi(x)$', fontsize = 18)
plt.legend(loc="upper right")
plt.show()

Direct Sampling Rejection (Cut) Algorithm

Drop points into a rectangle that frames the target distribution, and keep only those that fall below the curve.

N.B. The rejections here belong to a direct-sampling algorithm, not to a Markov chain.

In [20]:
y_max = 1.0 / np.sqrt(2.0 * np.pi)
x_cut = 3.5

n_data = 10000
n_accept = 0
data = []
X, Y = [], []

while n_accept < n_data:
    y = random.uniform(0.0, y_max)
    x = random.uniform(-x_cut, x_cut)
    if y < np.exp( - x **2 / 2.0) / np.sqrt(2.0 * np.pi): 
        n_accept += 1
        data.append(x)
        X.append(x)
        Y.append(y)
        
t = np.linspace(-3.5, 3.5, N) 
yt = np.exp(- t ** 2 / 2.0) / np.sqrt(2.0 * np.pi) 

fig = plt.figure(figsize=(12,8))
plt.hist(data, 100, density=True, label="x histogram of samples", alpha=0.8)
plt.scatter(X, Y, color='black', marker='.', linewidths=0.1, alpha=0.6, label="(x,y) point samples")

plt.plot(t, yt, c='red', linewidth=2.0, label="$e^{-x^2/2}/\sqrt{2\pi}$")
plt.title('Theoretical Gaussian distribution $\pi(x)$ and \
    \nnormalized histogram for '+str(len(data))+' samples', fontsize = 18)
plt.xlabel('$x$', fontsize = 18)
plt.ylabel('$\pi(x)$', fontsize = 18)
plt.xlim([-x_cut, x_cut])
plt.ylim([0, y_max])
plt.legend(loc="upper right")
plt.show()

Dealing with inverse square root distribution

This method fails, e.g., for the inverse-square-root distribution, because the density diverges at $x = 0$ and no finite box can frame it: truncating at some $y_{max}$ biases the samples near zero, while choosing a larger box makes the rejection rate very high, rendering the algorithm inefficient.

In [21]:
y_max = 100.0
x_cut = 1.0
n_data = 10000
data = []
n_accept = 0

while n_accept < n_data: 
    y = random.uniform(0.0, y_max)
    x = random.uniform(0.0, x_cut)
    if y < 1.0 / (2.0 * np.sqrt(x)):
        n_accept += 1
        data.append(x)

plt.hist(data, bins=100, density=True, label="samples")

t = np.linspace(0.01, 1, n_data)
yt = 1.0 / (2.0 * np.sqrt(t))

plt.plot(t, yt, 'red', linewidth = 2, label="$1/(2\sqrt{x})$")
plt.title('Theoretical distribution $\pi(x)={1}/{(2 \sqrt{x})}$ and normalized\
    \n histogram for '+str(n_accept)+' accepted samples',fontsize=16)
plt.xlabel('$x$', fontsize=18)
plt.ylabel('$\pi(x)$', fontsize=18)
plt.legend(loc="upper right")
plt.show()

In this respect the Markov-chain algorithm is better: it has no problem with integrable divergences of the probability density, and it does not require choosing any box size. In the algorithm below, we perform Markov-chain sampling with the Metropolis acceptance probability and record the maximal value of $y$ and the minimal value of $x$. We observe that the algorithm reaches very small values of $x$ very rapidly.

In [22]:
x = 0.2
delta = 0.5
data = []
y_max = 0
n_trials = 100000

for k in range(n_trials):
    x_new = x + random.uniform(-delta, delta)
    if 0.0 < x_new < 1.0:
        if random.uniform(0.0, 1.0) < np.sqrt(x) / np.sqrt(x_new):
            x = x_new
    if 0.5 / np.sqrt(x) > y_max:
        y_max = 0.5 / np.sqrt(x)
        print(y_max, x, k)
    data.append(x)

    
t = np.linspace(0.01, 1, n_data)
yt = 1.0 / (2.0 * np.sqrt(t))

fig = plt.figure(figsize=(12,8))
plt.hist(data, bins=100, density=True, label="samples")
plt.xlabel('$x$', fontsize=16)
plt.ylabel('$\pi(x)$', fontsize=16)
plt.plot(t, yt, linewidth=1.5, color='r', label="exact")
plt.title('Theoretical distribution $\pi(x)={1}/{(2 \sqrt{x})}$ and normalized\
    \n histogram for ' + str(len(data)) + ' samples', fontsize=16)
plt.legend(loc="upper right")
plt.show()
1.118033988749895 0.2 0
1.5289460757276208 0.10694381472885406 6
1.567971677890471 0.1016865659529983 11
1.7345607098586362 0.08309234182788539 17
1.752074164333056 0.0814394885048888 35
1.8459846970022364 0.073364137433649 43
3.423393499006586 0.02133174411445482 54
33.42993251540596 0.0002237015580436319 90
36.936135592539394 0.00018324709434791586 1133
46.893125892281304 0.00011368983630988261 3410
246.22231415836816 4.123682228573955e-06 7983

Tower Sampling Method (rejection free)

Assume that the probabilities of the activities $(a, b, c, d)$ are $\pi_a, \pi_b, \pi_c, \pi_d$. We could use the direct-sampling "cut" algorithm discussed above to select an activity: throw a point into a rectangle that contains the probability bars, and select the activity if the point falls below the "curve". However, this method may have a high rejection rate. That is why it is more efficient to use the rejection-free tower sampling algorithm: we stack the probabilities on top of each other, like a tower, throw a point uniformly into the tower, and the segment it lands in determines the activity.

In [23]:
def bisection_search(eta, w_cumulative):
    """Bisection search to find the bin corresponding to eta."""
    kmin = 0
    kmax = len(w_cumulative)
    while True:
        k = int((kmin + kmax) / 2)
        if w_cumulative[k] < eta:
            kmin = k
        elif w_cumulative[k - 1] > eta:
            kmax = k
        else:
            return k - 1

def tower_sample(weights):
    """Sample an integer number according to weights."""
    sum_w = sum(weights)
    w_cumulative = [0.0]

    for l in range(len(weights)):
        w_cumulative.append(w_cumulative[l] + weights[l])
    
    eta = random.random() * sum_w
    sampled_choice = bisection_search(eta, w_cumulative)
    return sampled_choice
In [24]:
weights = [0.4, 0.3, 0.8, 0.1, 0.2]
n_samples = 20
for sample in range(n_samples):
    print(tower_sample(weights), end="--")
4--0--2--1--2--2--2--2--2--1--4--2--2--1--2--2--0--2--2--0--
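Tower sampling is just the inverse-CDF method for a discrete distribution, so the cumulative table and the bisection can also be built with the standard library (a sketch; `tower_sample_bisect` is our own name, equivalent to `tower_sample` above):

```python
import random
from bisect import bisect
from itertools import accumulate

def tower_sample_bisect(weights, rng=random):
    """Rejection-free sampling: drop a uniform point into the tower
    of cumulative weights and locate its segment by bisection."""
    cumulative = list(accumulate(weights))   # tower levels
    eta = rng.random() * cumulative[-1]      # uniform height within the tower
    return bisect(cumulative, eta)           # index of the containing segment

weights = [0.4, 0.3, 0.8, 0.1, 0.2]
rng = random.Random(0)
n = 100000
counts = [0] * len(weights)
for _ in range(n):
    counts[tower_sample_bisect(weights, rng)] += 1
freqs = [c / n for c in counts]
print(freqs)   # should approach the normalized weights
```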

Walker Algorithm

Although the tower sampling algorithm is rejection free, it is not optimal: every sample costs a bisection search. That is why we consider Walker's algorithm (also known as the alias method). Represent the probabilities as boxes of unit width whose heights equal the probabilities. The algorithm goes as follows:

  • First compute the average probability

  • Put (part of) the tallest box on top of the shortest box, and return the part that sticks out above the average line back to its place

  • Continue this procedure until all boxes have average height, with at most two activities in each place

  • To sample, pick a column uniformly and then a point uniformly inside it to determine the activity

Walker's algorithm is optimal, but it applies only to discrete distributions.

In [26]:
# There are N=5 options with probability pi(a)=[prob, number]

N = 5

pi = [[1.1 / 5.0, 0], [1.9 / 5.0, 1], [0.5 / 5.0, 2], [1.25 / 5.0, 3], [0.25 / 5.0, 4]]

x_val = [a[1] for a in pi]
y_val = [a[0] for a in pi]

# average probability
pi_mean = sum(y_val) / float(N)

long_s = []   
short_s = []

for p in pi:
    if p[0] > pi_mean:
        long_s.append(p)
    else:
        short_s.append(p)

table = []
for k in range(N - 1):
    e_plus = long_s.pop()
    e_minus = short_s.pop()
    table.append((e_minus[0], e_minus[1], e_plus[1]))
    e_plus[0] = e_plus[0] - (pi_mean - e_minus[0])
    if e_plus[0] < pi_mean:
        short_s.append(e_plus)
    else:
        long_s.append(e_plus)

if long_s:
    table.append((long_s[0][0], long_s[0][1], long_s[0][1]))
else:
    table.append((short_s[0][0], short_s[0][1], short_s[0][1]))

print(table)

samples = []
n_samples = 10000

for k in range(n_samples):
    Upsilon = random.uniform(0.0, pi_mean)
    i = random.randint(0, N - 1)
    if Upsilon < table[i][0]:
        samples.append(table[i][1])
    else:
        samples.append(table[i][2])

plt.figure()
plt.hist(samples, bins=N, range=(-0.5, N - 0.5), density=True)
plt.plot(x_val, y_val, 'ro', ms=8)
plt.title("Histogram using Walker's method for a discrete distribution\n\
             on $N=$" + str(N) + " choices (" + str(n_samples) + " samples)", fontsize=18)
plt.xlabel('$k$', fontsize=20)
plt.ylabel('$\pi(k)$', fontsize=20)
plt.show()
[(0.05, 4, 3), (0.09999999999999998, 3, 1), (0.1, 2, 1), (0.17999999999999997, 1, 0), (0.19999999999999998, 0, 0)]

Tower Sampling Continuum Limit: Universality of Uniform

In [28]:
import scipy.special

n_trials = 100000
data = []

for trial in range(n_trials):
    Upsilon = random.uniform(0.0, 1.0)
    x = np.sqrt(2.0) * scipy.special.erfinv(2.0 * Upsilon - 1.0)
    data.append(x)

t = np.linspace(-4, 4, n_trials)

plt.plot(t, np.exp( - t **2 / 2.0) / np.sqrt(2.0 * np.pi))
h = plt.hist(data, bins=100, range=(-4, 4), density=True)
plt.title("Generating standard normal distribution from uniform using inverse CDF", fontsize=18)
plt.show()

Inverse square root distribution revisited

In [29]:
gamma = -0.5
n_trials = 10000
data = []

for _ in range(n_trials):
    u = random.uniform(0, 1)
    x = u ** (1.0/(gamma + 1.0)) 
    data.append(x)


t = np.linspace(0.01, 1, n_data)
yt = 1.0 / (2.0 * np.sqrt(t))

fig = plt.figure(figsize=(12,8))
plt.hist(data, bins=100, density=True, label="samples")
plt.xlabel('$x$', fontsize=16)
plt.ylabel('$\pi(x)$', fontsize=16)
plt.plot(t, yt, linewidth=1.5, color='r', label="exact")
plt.title('Theoretical distribution $\pi(x)={1}/{(2 \sqrt{x})}$ and normalized\
    \n histogram for ' + str(len(data)) + ' samples', fontsize=16)
plt.legend(loc="upper right")
plt.show()

Calculation of High Dimensional Integrals Using Monte Carlo Methods

Volume of the $d$-dimensional hypersphere $V_\text{d}(r)$

  • Surface area of the $d$-sphere (the boundary of the $(d+1)$-ball) of radius $r$: $ \ \ \ A_d(r)=\frac{2\pi^{(d+1)/2}}{\Gamma\left(\frac{d+1}{2}\right)}r^{d} = (2\pi r)V_{d-1}(r)$
  • Volume of the $d$-ball of radius $r$: $ \ \ \ V_d(r)=\int_0^r A_{d-1}(r') \text{d}r' = \frac{\pi^{d/2}}{\Gamma\left(\frac{d}{2}+1\right)}r^d = \frac{r}{d}A_{d-1}(r) \sim \frac{1}{\sqrt{\pi d}}\left(\frac{2\pi e}{d}\right)^{d/2}r^d $

Markov-chain Monte Carlo program to compute the quantity:

$Q(d) \equiv 2V_{d, sph}(1)/V_{d, cyl}(1) = V_{d, sph}(1)/V_{d-1, sph}(1) = \sqrt{\pi} \frac{\Gamma\left(\frac{d+1}{2}\right)}{\Gamma\left(\frac{d}{2}+1\right)}$
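Before running the chain, it is useful to tabulate the closed form for $Q(d)$ (a sketch using `math.gamma`; `Q_exact` and `V_exact` are our own helper names). The Monte Carlo estimates below should converge to these values:

```python
from math import gamma, pi, sqrt

def Q_exact(d):
    """Closed form Q(d) = sqrt(pi) * Gamma((d+1)/2) / Gamma(d/2 + 1)."""
    return sqrt(pi) * gamma((d + 1) / 2.0) / gamma(d / 2.0 + 1.0)

def V_exact(d):
    """Volume of the d-dimensional unit ball."""
    return pi ** (d / 2.0) / gamma(d / 2.0 + 1.0)

for d in (3, 4, 20):
    # the ratio of successive ball volumes must match the Gamma-function formula
    print(d, Q_exact(d), V_exact(d) / V_exact(d - 1))
```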

In [30]:
# sample points (x, y) inside the unit disk, rather than inside the square
# at each step, sample a new value of z as random.uniform(-1.0, 1.0), and count a "hit" if x^2 + y^2 + z^2 < 1.0.

x, y = 0.0, 0.0  # initial position
delta = 0.5      # step size

N_R = 400000      # number of steps
N_C = 0          # number of hits inside the unit 3-ball

for _ in range(N_R):
    z = random.uniform(-1.0, 1.0)
    dx, dy = random.uniform(-delta, delta), random.uniform(-delta, delta)
    if (x + dx)**2 + (y + dy)**2 < 1.0:
        x, y = x + dx, y + dy
   
    if x ** 2 + y ** 2 + z ** 2 < 1.0: 
        N_C += 1

Q_3 = 2*N_C/N_R
print("<Q(3)> is:", Q_3)
<Q(3)> is: 1.33222

Computing $V_{4}(1)$

In [31]:
# sample points (x,y,z) inside the 3D unit sphere.
# at each step, sample a w as random.uniform(-1.0, 1.0), and count as a "hit" if (x^2 + y^2 + z^2 + w^2) < 1.

x, y, z = 0.0, 0.0, 0.0  # initial position
delta = 0.5              # step size

N_R = 400000      # number of steps
N_C = 0          # number of hits inside the unit 4-ball

for _ in range(N_R):
    w = random.uniform(-1.0, 1.0)
    dx, dy, dz = random.uniform(-delta, delta), random.uniform(-delta, delta), random.uniform(-delta, delta)
    if (x + dx)**2 + (y + dy)**2 + (z + dz)**2 < 1.0:
        x, y, z = x + dx, y + dy, z + dz
   
    if x ** 2 + y ** 2 + z ** 2 + w ** 2 < 1.0: 
        N_C += 1

Q_4 = 2*N_C/N_R
V_3 = 4*np.pi/3.0
V_4 = V_3 * Q_4
print("Estimated <Q(4)> =", Q_4, ", exact Q(4) =", (np.pi**2/2)/V_3)
print("Estimated <V(4)> =", V_4, ", exact V(4) =", np.pi**2/2)
Estimated <Q(4)> = 1.178025 , exact Q(4) = 1.1780972450961724
Estimated <V(4)> = 4.934499580993488 , exact V(4) = 4.934802200544679

Computing $V_{d}(1)$

In [32]:
def rand_uniform_sphere(d, N):
    """ Uniformly samples points inside d-D unit sphere. 
        At each iteration we modify only one component.
        Returns: array of N samples (rows) and d columns
    """
    x = [0.0]*d      
    old_radius_square = 0.0
    delta = 0.5              
    samples = []
    rads = []
    for _ in range(N):
        k = random.randint(0, d - 1)
        step = random.uniform(-delta, delta)
        x_old_k = x[k]
        x_new_k = x_old_k + step
        new_radius_square = old_radius_square + x_new_k ** 2 - x_old_k ** 2
        if new_radius_square < 1.0:
            x[k] = x_new_k
            old_radius_square = new_radius_square
        samples += x  # extend with scalar copies; appending x itself would store references to the same mutable list
        rads.append(np.sqrt(old_radius_square))
    return np.array(samples).reshape(-1, d), rads


def Q(d, N):
    D = d-1
    x = [0.0]*D    
    old_radius_square = 0.0
    delta = 0.5              
    samples = []
    N_C = 0
    for _ in range(N):
        z = random.uniform(-1, 1)
        k = random.randint(0, D - 1)
        step = random.uniform(-delta, delta)
        x_old_k = x[k]
        x_new_k = x_old_k + step
        new_radius_square = old_radius_square + x_new_k ** 2 - x_old_k ** 2
        if new_radius_square < 1.0:
            x[k] = x_new_k
            old_radius_square = new_radius_square
        if old_radius_square + z**2 < 1:
            N_C += 1
    return 2*N_C/float(N)


def vol_sphere_exact(d):
    import math  # math.gamma; np.math was removed in NumPy 2.0
    return np.pi**(d/2.0)/math.gamma(d/2.0 + 1.0)


def vol_sphere(d, N):
    V_2 = np.pi
    if d == 2:
        return V_2
    elif d < 2:
        raise ValueError("Number of dimensions should be no less than 2!!")
    else:
        V_d = V_2
        
    for i in range(3, d+1):
        V_d *= Q(i, N)
    
    V_d_exact = vol_sphere_exact(d)
    error = round(100*(V_d - V_d_exact)/V_d_exact, 4)
    print(f"For d = {d}, estimated <V_d> = {V_d}, exact V_d = {V_d_exact}, error = {error}%")
    return V_d
        
           
V_d = vol_sphere(200, 100000) 
For d = 200, estimated <V_d> = 5.035018841366997e-109, exact V_d = 5.5588328420278045e-109, error = -9.4231%
In [33]:
diffs = []
Ns = [10**k for k in range(2, 7)]
for N in Ns:
    V_d = vol_sphere(20, N) 
    V_d_exact = vol_sphere_exact(20)
    diffs.append(V_d_exact - V_d)
For d = 20, estimated <V_d> = 0.021761070224337938, exact V_d = 0.02580689139001405, error = -15.6773%
For d = 20, estimated <V_d> = 0.026378148126377243, exact V_d = 0.02580689139001405, error = 2.2136%
For d = 20, estimated <V_d> = 0.02647224075792509, exact V_d = 0.02580689139001405, error = 2.5782%
For d = 20, estimated <V_d> = 0.027355690934349745, exact V_d = 0.02580689139001405, error = 6.0015%
For d = 20, estimated <V_d> = 0.026167105605964653, exact V_d = 0.02580689139001405, error = 1.3958%

Checking that samples are uniform and correctly distributed in $r$

In [35]:
samples, rads = rand_uniform_sphere(2, 10000)

h = plt.hist(rads, bins=100, density=True)
In [36]:
s = plt.scatter(samples[:, 0], samples[:, 1])
plt.axis('scaled')
Out[36]:
(-1.0976077428041813,
 1.0939292892327455,
 -1.0991575643986282,
 1.097714255212535)
In [37]:
N = 1000000
samples, rads = rand_uniform_sphere(20, N)
t = np.linspace(0, 1, N)

h = plt.hist(rads, bins=50, density=True)
plt.plot(t, 20*t**19)
Out[37]:
[<matplotlib.lines.Line2D at 0x1c1e6c86ba8>]
In [38]:
volume, dimension = [], []

def V_sph(dim):
    import math  # math.gamma; np.math was removed in NumPy 2.0
    return np.pi ** (dim / 2.0) / math.gamma(dim / 2.0 + 1.0)

for d in range(0,20):
    dimension.append(d)
    volume.append(V_sph(d))

plt.plot(dimension, volume, 'bo-')
plt.title('Volume of the unit hypersphere in $d$ dimensions')
plt.xlabel('$d$', fontsize = 15)
plt.ylabel('$V_{sph}(d)$', fontsize = 15)
plt.xlim(0,20)
plt.ylim(0,6)
for i in range(0,20):
    plt.annotate(round(volume[i],2), (dimension[i], volume[i]), xytext=(10, 0), ha='left',
                textcoords='offset points')
plt.grid(True)

Convert the Jupyter notebook into an HTML file in the given format

In [ ]:
notebook_file = r"P3_Sampling_and_Integration.ipynb"
html_converter(notebook_file)